Finding Originally Mislabels with MD-ELM

Authors

  • Anton Akusok
  • David Veganzones
  • Yoan Miché
  • Eric Séverin
  • Amaury Lendasse
Abstract

This paper presents a methodology which aims at detecting mislabeled samples, with a practical example in the field of bankruptcy prediction. Mislabeled samples are found in many classification problems and can bias the training of the desired classifier. This paper proposes a new method based on Extreme Learning Machine (ELM) which allows for identification of the most probable mislabeled samples. Two datasets are used in order to validate and test the proposed methodology: a toy example (XOR problem) and a real dataset from corporate finance (bankruptcy prediction).
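As a rough illustration of the building block named in the abstract, the following is a minimal sketch of a basic Extreme Learning Machine fitted to the four corner points of the XOR problem. It is not the MD-ELM procedure itself, which the abstract does not detail. The function names `train_elm` and `predict_elm`, the tanh activation, the hidden-layer size, and the ±1 label encoding are all assumptions made for this example.

```python
import numpy as np

def train_elm(X, y, n_hidden=20, seed=0):
    """Fit a basic ELM: random fixed hidden layer, least-squares output weights.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases, never trained
    H = np.tanh(X @ W + b)                       # hidden-layer outputs
    beta = np.linalg.pinv(H) @ y                 # output weights via pseudo-inverse
    return W, b, beta

def predict_elm(X, model):
    """Return the real-valued ELM output; its sign gives the predicted class."""
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

# XOR toy problem with labels encoded as -1 / +1 (an assumed encoding).
X_xor = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y_xor = np.array([-1.0, 1.0, 1.0, -1.0])

model = train_elm(X_xor, y_xor)
predicted_labels = np.sign(predict_elm(X_xor, model))
```

Because only the output weights are solved for, training reduces to a single linear least-squares problem, which is what makes repeated retraining of an ELM cheap.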

Similar resources

TROP-ELM: A double-regularized ELM using LARS and Tikhonov regularization

In this paper an improvement of the optimally pruned extreme learning machine (OP-ELM) in the form of a L2 regularization penalty applied within the OP-ELM is proposed. The OP-ELM originally proposes a wrapper methodology around the extreme learning machine (ELM) meant to reduce the sensitivity of the ELM to irrelevant variables and obtain more parsimonious models thanks to neuron pruning. The ...

Dissimilarity based ensemble of extreme learning machine for gene expression data classification

Extreme Learning Machine (ELM) has salient features such as fast learning speed and excellent generalization performance. However, a single extreme learning machine is unstable in data classification. To overcome this drawback, more and more researchers consider using ensemble of ELMs. This paper proposes a method integrating voting-based extreme learning machines (V-ELM) with dissimilarity (D-...

Classification of Hippocampal Region using Extreme Learning Machine

Important brain structures such as the hippocampus are usually segmented manually by doctors. However, the combination of machine learning with neuroimaging techniques has shown promising results in segmenting subcortical structures. Extreme Learning Machine (ELM) is reported to be a superior machine learning technique. This study will inve...

Phylogeny-aware identification and correction of taxonomically mislabeled sequences

Molecular sequences in public databases are mostly annotated by the submitting authors without further validation. This procedure can generate erroneous taxonomic sequence labels. Mislabeled sequences are hard to identify, and they can induce downstream errors because new sequences are typically annotated using existing ones. Furthermore, taxonomic mislabelings in reference sequence databases c...

Finding an Appropriate Cut-off Point for Neck Circumference to Determine Overweight and Obesity in a Large Sample of Iranian Adults

Objective: Obesity is a major public health concern and there are different ways to detect it in population. The aim of the present study is to evaluate the neck circumference (NC) in a simple and practical way. Materials and Methods: This cross-sectional survey utilized data from the Yazd Health Study (YaHS) which is a population-based cohort study. In brief, 9962 individuals aged 20-70 years...

Publication date: 2014